Analysis of Sequence Conservation at Nucleotide Resolution
Authors
Abstract
One of the major goals of comparative genomics is to understand the evolutionary history of each nucleotide in the human genome sequence, and the degree to which it is under selective pressure. Ascertainment of selective constraint at nucleotide resolution is particularly important for predicting the functional significance of human genetic variation and for analyzing the sequence substructure of cis-regulatory sequences and other functional elements. Current methods for analysis of sequence conservation are focused on delineation of conserved regions comprising tens or even hundreds of consecutive nucleotides. We therefore developed a novel computational approach designed specifically for scoring evolutionary conservation at individual base-pair resolution. Our approach estimates the rate at which each nucleotide position is evolving, computes the probability of neutrality given this rate estimate, and summarizes the result in a Sequence CONservation Evaluation (SCONE) score. We computed SCONE scores in a continuous fashion across 1% of the human genome for which high-quality sequence information from up to 23 genomes is available. We show that SCONE scores are clearly correlated with the allele frequency of human polymorphisms in both coding and noncoding regions. We find that the majority of noncoding conserved nucleotides lie outside of longer conserved elements predicted by other conservation analyses, and are experiencing ongoing selection in modern humans, as is evident from the allele frequency spectrum of human polymorphism. We also applied SCONE to analyze the distribution of conserved nucleotides within functional regions. These regions are markedly enriched in individually conserved positions and short (<15 bp) conserved "chunks." Our results collectively suggest that the majority of functionally important noncoding conserved positions are highly fragmented and reside outside of canonically defined long conserved noncoding sequences. A small subset of these fragmented positions may be identified with high confidence.
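The abstract does not spell out how SCONE is computed, so the following is only a toy illustration of the general idea of scoring a single alignment column against a neutral expectation. It is not the SCONE method. It assumes a Poisson model of substitutions, uses the number of distinct bases minus one as a crude lower bound on the substitution count, and takes a hypothetical `neutral_tree_length` parameter (expected neutral substitutions per site across the whole tree). All function names are invented for this sketch.

```python
import math


def min_substitutions(column):
    """Crude lower bound on substitutions in one alignment column:
    the number of distinct bases observed, minus one.
    Gaps and ambiguous characters are ignored (an assumption of this sketch)."""
    bases = {b.upper() for b in column if b.upper() in "ACGT"}
    return max(len(bases) - 1, 0)


def poisson_cdf(k, mean):
    """P(X <= k) for X ~ Poisson(mean)."""
    return sum(
        math.exp(-mean) * mean**i / math.factorial(i) for i in range(k + 1)
    )


def toy_neutrality_pvalue(column, neutral_tree_length):
    """Probability of seeing this few substitutions (or fewer) if the
    position evolved at the neutral rate. Small values suggest constraint.
    `neutral_tree_length` is a hypothetical input, not a value from the paper."""
    k = min_substitutions(column)
    return poisson_cdf(k, neutral_tree_length)


if __name__ == "__main__":
    conserved = list("AAAAAA")   # identical base in every species
    variable = list("ACGTAC")    # four distinct bases
    print(toy_neutrality_pvalue(conserved, 3.0))
    print(toy_neutrality_pvalue(variable, 3.0))
```

A fully conserved column yields a lower probability of neutrality than a variable one under the same assumed tree length. A real per-nucleotide method would use the phylogeny and a substitution model rather than a distinct-base count.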
Similar resources
Phylogenetic analysis of HSP70 gene of Aspergillus fumigatus reveals conservation intra-species and divergence inter-species
Aspergillus fumigatus is a saprophytic fungus, widely spread in a variety of ecological niches, and the most prevalent of the aspergilli responsible for human and animal invasive aspergillosis. The first step to develop novel and efficient therapies is the identification and understanding of the key tolerance and virulence factors of pathogens. The main focus of the present study is to perform the similarit...
Cloning and sequence analysis of VP1, VP2 and VP3 genes of Indian chicken anemia virus
Chicken anemia virus was detected by PCR in tissue samples collected from poultry flocks in Gujarat, India. The VP1, VP2 and VP3 gene sequences of CAV from Anand, Gujarat were obtained after cloning the PCR products in pDrive cloning vector. Nucleotide sequence alignment with other CAV sequences demonstrated overall identity of 95-98.8%, 98.8-99.8% and 98.8-100% for VP1, VP2 and VP3 regions, respec...
Nucleotide sequence analysis of the Second Internal Transcribed Spacer (ITS2) in Hyalomma anatolicum anatolicum in Iran
Ticks are important acarina that infest animals. They are obligatory blood sucker arthropods which economically impact cattle industry by reducing weight gain and production. Moreover, they are important vectors of viral, bacterial, rickettsial and parasitic pathogens infecting humans and animals. In view of the importance of Hyalomma anatolicum anatolicum in pathogen transmission, including Th...
Phylogenetic Analysis of Three Long Non-coding RNA Genes: AK082072, AK043754 and AK082467
It is now clear that protein is just one of the functional products of the eukaryotic genome. Indeed, a major part of the human genome is transcribed into non-coding sequences rather than protein-coding sequences. In this study, we selected three long non-coding RNAs, namely AK082072, AK043754 and AK082467, which show brain expression and local region conservation among vertebr...
Nucleotide sequence of cDNA encoding for preprochymosin in native goat (Capra hircus) from Iran
Prochymosin is one of the most important aspartic proteinases used as a milk-clotting enzyme in cheese production. In the present investigation we report the sequence of cDNA encoding goat (Capra hircus) preprochymosin and compare its nucleotide and deduced amino acid sequences with the sequences of other ruminant preprochymosins. Like bovine prochymosin, the caprine prochymosin cDNA encodes 365 amino ...
Journal:
- PLoS Computational Biology
Volume 3
Publication date: 2007